Close correlation between thiolate basicity and certain NMR parameters in cysteine and cystine microspecies

The imbalance between prooxidants and antioxidants in biological systems, known as oxidative stress, can disrupt redox signaling through reactive oxygen/nitrogen species and is related to severe diseases. The moiety most vulnerable to oxidant species in the redox signaling pathways is the thiol (SH) group of cysteine residues, especially in its deprotonated (S−) form. Cysteine, along with its oxidized, disulfide-containing form, cystine, constitutes one of the most abundant low molecular weight biological redox couples and contributes significantly to redox homeostasis in living systems. In this work, NMR spectra of cysteine, cystine, and cysteine-containing small peptides were studied at the submolecular level; the chemical shifts of selected nuclei make it possible to estimate either the thiolate basicity or the related standard redox potential. Regression analysis demonstrated a strong linear relationship between chemical shift and thiolate logK for the cysteine microspecies data. The αCH 13C chemical shift is the most promising estimator of the acid-base and redox character.


Introduction
The imbalance between prooxidant and antioxidant pathways in biological systems, known as oxidative stress, can disrupt redox signaling through reactive oxygen/nitrogen species and is related to aging, atherosclerosis, carcinogenesis, diabetes, and neurodegeneration [1,2]. Although these reactive oxidizing species play essential roles against infectious pathogens and in cellular signaling systems, their effects are favorable only if they are present in low or moderate concentrations and under tight cellular regulation [3]. The primary chemical moiety targeted by oxidizing species in the redox signaling pathways is the thiol group of cysteine (CysSH or Cys). Cysteine oxidation is notable in pathological mechanisms in which an oxidized protein has its activity modified and becomes vulnerable to aggregation and degradation [4]. The most common form of oxidized cysteine in biochemical processes is cystine (CysSSCys or (Cys)2), which contains a disulfide bond formed by the loss of two electrons from two cysteine species, each bearing a thiolate group. The cysteine/cystine redox couple is one of the most abundant low molecular weight redox couples in human plasma, rivaled only by the tripeptide-type glutathione (GSH)/glutathione disulfide (GSSG) redox buffer [5]. In 2016, our research group introduced an indirect method that uses species-specific standard redox potentials to characterize thiolate-disulfide equilibria with pH-independent parameters [6]. Such parameters are necessary to explore thoroughly the oxidation process, in which only the thiolate moiety participates, untangled from the parallel acid-base processes.
Despite cysteine being a pivotal regulator of redox homeostasis and signaling, not all cysteine residues present in proteins are likely to be oxidized; disulfide bridge formation depends on several other factors, such as solvent accessibility, pKa (henceforth discussed in terms of the protonation constant, logK), polarity of nearby residues, and steric proximity.
In order to better comprehend the biological function of cysteine oxidation and to establish an antioxidant therapy that could address the currently unmet medical need posed by oxidative stress [7,8], it is essential to find new ways to reveal the possible relationships between the subtle and co-dependent redox, acid-base, and spectroscopic properties.
A complete microspeciation of cysteamine, cysteine, homocysteine, and their respective homodisulfides was previously carried out by means of 1H NMR-pH titrations, which allowed the detailed acid-base processes of thiol-containing amino acids to be understood at the submolecular level [9,10]. Here we extend the observed correlation between standard redox potentials and thiolate logK [6] to chemical shift values, to highlight the predictive power of NMR parameters available from relatively simple, single spectroscopic measurements.

Materials
Cysteine and cystine were purchased from Sigma (Merck) and were used without further purification. Deionized water was prepared with a Milli-Q Direct 8 Millipore system. Compounds 1-11 were purchased from ProteoGenix (Schiltigheim, France).
The cysteine derivatives (12-15) were synthesized manually on TentaGel R RAM resin (0.19 mmol/g) with Fmoc chemistry on a Rink amide linker on a 0.1 mmol scale. The coupling of cysteine was performed as follows: 3 equivalents of Fmoc-protected amino acid, 3 equivalents of the uronium coupling agent O-(7-azabenzotriazol-1-yl)-N,N,N′,N′-tetramethyluronium hexafluorophosphate (HATU), and 6 equivalents of N,N-diisopropylethylamine (DIPEA) were used in N,N-dimethylformamide (DMF) as solvent, with shaking for 3 h. After the coupling steps, the resin was washed 3 times with DMF, once with methanol, and 3 times with dichloromethane. Deprotection was performed with 2% 1,8-diazabicyclo[5.4.0]undec-7-ene (DBU) and 2% piperidine in DMF in two steps, with reaction times of 5 and 15 min. After removal of the Fmoc group, the resin was washed and a further coupling step was carried out with the appropriate benzoic acid derivative, using HATU and DIPEA. The resin was washed with the same solvents as described above. The cleavage was performed with trifluoroacetic acid/water/DL-dithiothreitol (DTT)/triisopropylsilane (TIS) (90:5:2.5:2.5) at 0 °C for 2 h. The cleavage cocktail was evaporated, and the peptide was precipitated with diethyl ether. After the precipitation, the cysteine derivatives were dissolved in 10% acetic acid solution and lyophilized. As a final purification, the solid residue that remained after lyophilization was triturated with diisopropyl ether. The crystals obtained in this step were washed with diisopropyl ether.

NMR spectroscopy measurements
NMR spectra were recorded on a Varian Unity Inova DDR spectrometer (599.9 MHz for 1H) with a 5 mm 1H{13C/31P-15N} pulsed field gradient triple resonance probehead at 298.15 ± 0.1 K. The solvent was H2O:D2O 95:5 (V/V); the ionic strength was adjusted to 0.15 mol/L with KCl. The pH values were adjusted with HCl or NaOH and determined in situ by internal indicator molecules (at ca. 1 mmol/L) optimized for 1H NMR [11,12]. The sample volume was 550 μL, and every sample contained ca. 1 mmol/L DSS (3-(trimethylsilyl)propane-1-sulfonate) as chemical shift reference. The H2O 1H signal was suppressed with a presaturation sequence; the average acquisition parameters for 1H measurements were: number of transients = 16, number of points = 65536, acquisition time = 3.33 s, relaxation delay = 1.5 s. 1H-13C HSQC measurements were performed with solvent signal presaturation and the following parameters: number of transients = 64, number of increments = 96, number of points = 2884, acquisition time = 149.968 ms, relaxation delay = 1 s.

Statistical analysis
Non-linear regression analyses on the titration data were carried out using R version 4.0.5 (R Foundation for Statistical Computing, Vienna, Austria) [13] with the following function:
\[ \delta_{\mathrm{obs}} = \frac{\delta_{\mathrm{L}} + \delta_{\mathrm{HL}} \cdot 10^{\log K - \mathrm{pH}}}{1 + 10^{\log K - \mathrm{pH}}} \tag{1} \]
where δ_obs is the observed chemical shift, δ_L is the chemical shift of the unprotonated moiety, δ_HL is the chemical shift of the protonated moiety, and logK is the base 10 logarithm of the group-specific protonation constant. Linear regression analyses for the chemical shift-logK data were carried out with the same software [13].
Table 1. The species-specific chemical shifts (on the ppm scale) determined for cysteine and cystine microspecies. The uncertainty of determination for chemical shift on the ppm scale is 0.001 and 0.01 for 1H and 13C chemical shift values, respectively. For the thiol- and disulfide-bearing species, the concomitant thiolate protonation constant given in the third column refers to the relevant thiolate-bearing microspecies that gives rise to such thiol- or disulfide-bearing species via protonation or oxidation, respectively. The thiolate protonation constants were determined previously in [9].
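The single-site titration fit described in this section can be sketched as follows. This is a minimal illustration in Python, not the R script used in the study: it assumes the usual fast-exchange model in which the observed shift is the population-weighted average of the unprotonated (δ_L) and protonated (δ_HL) limiting shifts. The function names, the logK search range, and the synthetic titration points are all assumptions made for the example.

```python
import math

def shift_model(pH, delta_L, delta_HL, logK):
    """Observed shift of a nucleus next to one protonation site (fast exchange)."""
    w = 10.0 ** (logK - pH)              # ratio [HL]/[L] at this pH
    return (delta_L + delta_HL * w) / (1.0 + w)

def fit_linear_part(pH_values, shifts, logK):
    """For a fixed logK the model is linear in delta_L and delta_HL,
    so both follow from 2x2 normal equations. Returns (delta_L, delta_HL, SSE)."""
    f = [1.0 / (1.0 + 10.0 ** (p - logK)) for p in pH_values]   # protonated fraction
    u = [1.0 - fi for fi in f]                                  # unprotonated fraction
    suu = sum(ui * ui for ui in u)
    svv = sum(fi * fi for fi in f)
    suv = sum(ui * fi for ui, fi in zip(u, f))
    suy = sum(ui * y for ui, y in zip(u, shifts))
    svy = sum(fi * y for fi, y in zip(f, shifts))
    det = suu * svv - suv * suv
    delta_L = (suy * svv - svy * suv) / det
    delta_HL = (svy * suu - suy * suv) / det
    sse = sum((y - delta_L * ui - delta_HL * fi) ** 2
              for y, ui, fi in zip(shifts, u, f))
    return delta_L, delta_HL, sse

def fit_titration(pH_values, shifts, lo=7.0, hi=10.0, step=0.05):
    """Coarse grid scan over logK, then golden-section refinement of the SSE."""
    def sse(logK):
        return fit_linear_part(pH_values, shifts, logK)[2]
    n = int(round((hi - lo) / step))
    best = min((lo + i * step for i in range(n + 1)), key=sse)
    a, b = best - step, best + step
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(60):
        c = b - g * (b - a)
        d = a + g * (b - a)
        if sse(c) < sse(d):
            b = d                        # minimum lies in [a, d]
        else:
            a = c                        # minimum lies in [c, b]
    logK = 0.5 * (a + b)
    delta_L, delta_HL, _ = fit_linear_part(pH_values, shifts, logK)
    return delta_L, delta_HL, logK

# Synthetic, noise-free titration (invented values, not data from this study)
pH_demo = [6.0 + 0.5 * i for i in range(11)]
obs_demo = [shift_model(p, 2.90, 3.05, 8.50) for p in pH_demo]
print(fit_titration(pH_demo, obs_demo))
```

Separating the linear parameters from logK keeps the sketch free of external dependencies; a general-purpose non-linear least-squares routine, such as the one available in R, would fit all three parameters at once.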

Cysteine and cystine species-specific chemical shift data
The acid-base microspeciation schemes of cysteine and cystine, with the symbols of the various species, are presented in Fig 1. The species-specific protonation constants of cysteine and cystine are identical with those in previous works [9,14]. The species-specific NMR chemical shifts of the αCH and βCH2 nuclei were determined from 1H and 1H-13C HSQC NMR spectra measured at limiting pH values (corresponding to the plateaus on the titration curves of the compounds, see S1 and S2 Figs). The species-specific chemical shifts of the cysteine and cystine microspecies were determined using the Sudmeier-Reilley equations [15]; this method was recently elaborated for the analogous selenocysteine/selenocystine pair [16]. Briefly, the chemical shifts recorded at the limiting pH values afford the species-specific chemical shift values of the major microspecies (see the major microspeciation pathway in Figs 1 and 3), since the contribution of minor microspecies to the observed chemical shifts is insignificant. The chemical shifts of the major microspecies also afford the protonation shifts (Δδ) associated with the various basic moieties, which in turn allow the NMR chemical shifts of the minor microspecies to be determined as well. The species-specific chemical shifts are compiled in Table 1, grouped according to thiolate-bearing, thiol-bearing, and the complementary disulfide-bearing microspecies. The thiolate-specific protonation shifts determined for cysteine on the ppm scale are as follows: Δδ 1H(βCH2) = 0.29 and −0.03.
We performed separate multiple linear regression analyses on the data in Table 1, using the NMR chemical shifts as independent variables and logK as the dependent variable; the results are depicted in Fig 2 with solid scatter points and regression lines.
Note that the assignment of independent and dependent variables is not meant to reflect a causal relationship between the parameters; it is designed purely to establish a model that predicts logK values from chemical shifts. The results showed that in each of the three cases the αCH 13C chemical shift made the most reliable contribution to the model. The scatter plots in Fig 2 also show that this chemical shift has the best fit to, and the best predictive potential for, the logK values. The parameters of the multiple linear regression analysis are presented in Table 2. Three quantities should be distinguished when interpreting the results: (a) the adjusted R2 characterizes the vertical dispersion of the data points around the linear fit and quantifies how much of the variability of the data the linear model explains; (b) the slope of the regression line characterizes the degree and direction of response between the dependent and independent variables, i.e. how much the chemical shift differs between species of different thiolate basicity; (c) the thiolate-specific protonation shift is the change in chemical shift that a nucleus undergoes when the thiolate moiety changes from the unprotonated (thiolate) to the protonated (thiol) form.
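The difference between the slope and the adjusted R2 can be illustrated with a short sketch. It is written in Python rather than the R environment used for the analysis, it uses a single predictor instead of the multiple regression reported in Table 2, and the numbers are invented; the function name `ols_fit` and the demo values are assumptions for illustration only.

```python
def ols_fit(x, y, n_predictors=1):
    """Ordinary least squares for y = intercept + slope * x.
    Returns (slope, intercept, R2, adjusted R2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R2 penalizes the number of predictors relative to sample size
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return slope, intercept, r2, adj_r2

# Hypothetical shift (ppm) and logK pairs, invented for this example only
shift_demo = [56.1, 57.0, 58.2, 59.1, 60.3]
logK_demo = [8.2, 8.6, 9.1, 9.4, 10.0]
slope, intercept, r2, adj_r2 = ols_fit(shift_demo, logK_demo)
print(f"slope = {slope:.3f} logK units per ppm, adjusted R2 = {adj_r2:.3f}")
```

The slope reports how strongly the predicted logK responds to the chemical shift, whereas the adjusted R2 reports how tightly the points scatter around the line; a data set can have a steep slope with poor adjusted R2, or the reverse.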

Species-specific chemical shift data of cysteine-containing peptides
In order to extend the validity of the multiple linear regression model obtained from the cysteine and cystine data, we chose to include other cysteine derivatives in the analysis, notably glutathione and other tripeptides meant to model the varying environments of mid-chain cysteine residues. We assumed that cysteine residues with neighboring amino acids of varying electronic effects (compounds 1-11) would exhibit varying acid-base and NMR characteristics depending on their neighboring residue. Certain non-peptide cysteine derivatives with strongly electron-withdrawing conjugates (compounds 12-15) were also chosen in order to extend the logK range in which data points could be acquired. Cysteine thiolate protonation constants are usually expected to fall between logK 8 and 10; however, since oxidoreductase enzymes must have reactive cysteine residues bearing unprotonated thiolate moieties for catalysis, there are instances in which the cysteine thiolate logK was found to be much lower than 7 (e.g. 3.5 or 4) [17][18][19][20]. Therefore, we hoped to extend the logK range of the linear model well below 7 by examining the compounds selected for further investigation, which are listed in Table 3. The species-specific protonation constants of glutathione and glutathione disulfide were taken from previous works [9,14]. The notations of the glutathione microspecies can be found in Fig 3.
Table 3. The species-specific chemical shifts determined for the additional compounds studied (beyond cysteine and cystine) bearing thiolate moiety. The uncertainty of determination for chemical shift on the ppm scale is 0.001 and 0.01 for 1H and 13C chemical shift values, respectively. The thiolate protonation constants of glutathione were determined previously in [14], while the protonation constants of the peptides were determined as described in the Materials and Methods section and are reported as logK ± standard deviation of the regression fit.
The thiolate protonation constants of the remaining compounds were determined with 1H NMR-pH titrations by plotting the 1H chemical shift of the cysteine αCH vs pH. Non-linear regression analyses afforded the protonation constants using Eq (1). Based on the protonation constants compiled in Table 3, it is apparent that the originally anticipated lower thiolate logK values were not observed in the cysteine derivatives 12-15. Tables 3-5 contain the NMR chemical shift data determined for these additional
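The practical meaning of the thiolate logK ranges quoted in this section can be checked with a short calculation: for a single protonation site, the fraction present as thiolate at a given pH follows from logK alone. The sketch below is in Python; the function name and the choice of pH 7.4 as a reference value are assumptions made for illustration.

```python
def thiolate_fraction(pH, logK):
    """Fraction of a single-site thiol present in the unprotonated (thiolate) form."""
    return 1.0 / (1.0 + 10.0 ** (logK - pH))

# pH 7.4 is an assumed reference value; the logK values are those quoted in the text
for logK in (10.0, 8.0, 4.0, 3.5):
    fraction = thiolate_fraction(7.4, logK)
    print(f"logK = {logK:4.1f}: thiolate fraction at pH 7.4 = {fraction:.4f}")
```

With logK between 8 and 10, only a minority of the side chain is deprotonated at this pH, whereas with logK near 3.5 to 4 it is almost entirely in the thiolate form, which is why such low values are associated with catalytically reactive cysteine residues.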

Discussion
Cysteine is the most important thiol-bearing amino acid and the pivotal regulator of redox homeostasis and signaling. However, not all cysteine residues in proteins are likely to be oxidized; oxidation depends on the solvent accessibility, thiolate basicity, and polarity of the nearby residues [7]. The analysis of the chemical shift data reveals a direct relationship between logK and the 13C chemical shifts and an inverse relationship between logK and the 1H chemical shifts. It was also observed that, with regard to the correlation, the 1H protonation shifts are considerably smaller than their 13C counterparts. Furthermore, the differences between the reduced and oxidized species are also smaller in terms of 1H chemical shifts. In contrast, the 13C chemical shift data of the αCH reveal the redox state of the species as well as the relevant physico-chemical properties. Corroborating this finding, Sharma and Rajarathnam demonstrated in a previous study that 13C NMR chemical shifts can clearly indicate disulfide bond structure and distinguish the reduced and oxidized states of cysteine [21]. Regarding the protonation constants of the cysteine-containing peptides (Table 3), it is interesting to observe that the presence of neighboring amino acid residues (even with highly electron-withdrawing groups) does not influence the thiolate basicity, i.e. the residue neighboring a cysteine has virtually no bearing on the acid-base/redox properties of the cysteine thiolate.
Table 5. The species-specific chemical shifts determined for the additional compounds studied (beyond cysteine and cystine) bearing disulfide moiety. The uncertainty of determination for chemical shift on the ppm scale is 0.001 and 0.01 for 1H and 13C chemical shift values, respectively. The microspecies symbols for glutathione disulfide are not all depicted on Fig 3; labeling here is assumed to continue on the GSSG microspeciation scheme in alphabetical order, with labeling continuing after Z with AA, AB, and so on.
This leads us to the conclusion that the properties of a cysteine side chain in a peptide can only be perturbed via steric interactions. The regression analysis presented in Fig 2 reveals a strong linear relationship between chemical shifts and thiolate basicities within the data of cysteine and cystine microspecies, whereas the data from the other compounds (Tables 3-5) adhere to the correlation only in the case of the αCH 13C. The data from the αCH 13C nucleus of the studied peptides show the best conformity to the linear correlation of the cysteine data; this nucleus is therefore the best available option for estimating thiolate properties from NMR data. The linear regression fits on this nucleus alone are shown for all studied compounds in Fig 4. This carbon is probably the best indicator of thiolate characteristics because of its position relative to the sulfur atom: the two are connected via two covalent bonds, which appears to be the optimal covalent distance for an NMR reporter nucleus. In contrast, the αCH 1H chemical shifts of cysteine are presumably perturbed more by the protonation state of neighboring moieties or by the presence of a peptide bond, which disqualifies this nucleus as an indicator of the properties of the sulfur atom. The βCH2 nuclei also show this phenomenon and correlate more weakly with logK altogether. These observations hold for the regression analysis of the thiolate-bearing species as well as for that of the thiol-bearing and the concomitant disulfide-bearing species. It is often assumed that chemical shifts are highly susceptible to changes in the microenvironment of the NMR-active nuclei. Moreover, the correlation among cysteine chemical shifts, logK, and redox potentials could improve our understanding of the chemistry and biological function of cysteine oxidation [21].
Through the accrued chemical shift data set and regression analysis, it is also possible to estimate thiolate basicity/thiol acidity and the concomitant standard redox potential.
Nevertheless, the obvious limitation of this method is the window of the regression analysis. Since the species-specific thiolate basicities observed in cysteine are limited to a certain window, and the analysis of the derivative compounds did not extend this range, extending the range of the regression requires more measurements on larger peptides. We performed a thorough literature search for reported thiolate logK values in the PKAD Database [22] and for the corresponding chemical shift values of the cysteine residue in the Biological Magnetic Resonance Data Bank (BMRB) (http://www.bmrb.wisc.edu). Unfortunately, the literature review produced only a handful of data points, which seem too unreliable to incorporate into the model. Our research group is currently investigating larger peptides, as we hope that the determination of species-specific chemical shifts of additional peptides will extend the regression model and improve its utility.

Conclusion
It was possible to confirm a strong linear relationship between the chemical shift data and thiolate logK within the cysteine microspecies, specifically for the αCH 13C. The next step in improving this model is to analyze peptides with lower thiolate basicity and thereby extend the correlation, so that it can be applied to larger proteins to estimate the acid-base and redox character of cysteine residues from their NMR chemical shifts.